Abstract: When extracting discriminative features from multimodal data, current methods
rarely concern themselves with the data distribution. In this paper, we present an
assumption that is consistent with the viewpoint of discrimination, that is, a person's
overall biometric data should be regarded as one class in the input space, and his different
biometric data can form different Gaussian distributions, i.e., different subclasses. Hence,
we propose a novel multimodal feature extraction and recognition approach based on 
subclass discriminant analysis (SDA). Specifically, one person's different bio-data are 
treated as different subclasses of one class, and a transformed space is calculated, where 
the difference among subclasses belonging to different persons is maximized, and the 
difference within each subclass is minimized. Then, the obtained multimodal features are 
used for classification. Two solutions are presented to overcome the singularity problem
encountered in the calculation: PCA preprocessing, and the
generalized singular value decomposition (GSVD) technique. Further, we
provide nonlinear extensions of SDA based multimodal feature extraction, that is,
feature fusion based on KPCA-SDA and KSDA-GSVD. In KPCA-SDA, we first apply
Kernel PCA to each single modality before performing SDA. In KSDA-GSVD, we



Sensors 2012, 12 



5552 



directly perform Kernel SDA to fuse multimodal data, applying GSVD to avoid the
singularity problem. For simplicity, two typical types of biometric data are considered
in this paper, i.e., palmprint data and face data. Experimental results show that our approaches
outperform several representative multimodal biometric recognition methods, and that
KSDA-GSVD achieves the best recognition performance.

Keywords: multimodal biometric feature extraction; palmprint and face; subclass 
discriminant analysis (SDA); generalized singular value decomposition (GSVD); kernel 
subclass discriminant analysis (KSDA) 



1. Introduction 

Multimodal biometric recognition techniques combine features from multiple sources in order to
obtain integrated and more essential information about the same object. This is an active research
direction in the biometric community, because it can overcome many problems that trouble traditional
single-modal biometric systems, such as unstable feature extraction, noisy sensor data,
restricted degrees of freedom, and unacceptable error rates. Information fusion is usually conducted at
three levels, i.e., pixel level [1,2], feature level [3-5] and decision level [6-9]. The former two levels
mainly aim at learning descriptive features, while the last level aims at finding a more effective way to
use learned features for decision making. At the pixel level and feature level in particular,
discriminant analysis techniques play an important role in acquiring more descriptive or more
discriminative features.

Linear discriminant analysis (LDA) is a popular and widely used supervised discriminant analysis 
method [10]. LDA calculates the discriminant vectors by maximizing the between-class scatter and 
minimizing the within-class scatter simultaneously. It is effective in extracting discriminative features 
and reducing dimensionality. Many methods have been developed to improve the performance of LDA, 
such as enhanced Fisher linear discriminant model (EFM) [11], improved LDA [12], uncorrelated
optimal discriminant vectors (UODV) [13], discriminant common vectors (DCV) [14], incremental
LDA [15], semi-supervised discriminant analysis (SSDA) [16], local Fisher discriminant analysis [17], 
Fisher discrimination dictionary learning [18], and discriminant subclass-center manifold preserving 
projection [19]. 

In recent years, many kernel discriminant methods have been presented to extract nonlinear 
discriminative features and enhance the classification performance of linear discrimination techniques, 
such as kernel discriminant analysis (KDA) [20,21], kernel direct discriminant analysis (KDDA) [22], 
improved kernel Fisher discriminant analysis [23], complete kernel Fisher discriminant (CKFD) [24], 
kernel discriminant common vectors (KDCV) [25], kernel subclass discriminant analysis (KSDA) [26], 
kernel local Fisher discriminant analysis (KLFDA) [27], kernel uncorrelated adjacent-class discriminant
analysis (KUADA) [28], and mapped virtual samples (MVS) based kernel discriminant framework [29].

In this paper, we develop a novel multimodal feature extraction and recognition approach
based on linear and nonlinear discriminant analysis techniques. We adopt the feature fusion strategy, as
features play a critical role in multimodal biometric recognition. More specifically, we try to answer the
question of how to effectively obtain discriminative features from multimodal biometric data. Some
related works have appeared in the literature. In [1,2], multimodal data vectors are first stacked into a
higher dimensional vector to form a new sample set, from which discriminative features are extracted
for classification. Yang [3] discussed two feature fusion strategies, namely the parallel strategy and the
serial strategy. The former uses complex vectors to fuse multimodal features, i.e., the feature of one
modality is represented as the real part, and the feature of the other modality is represented as the
imaginary part; the latter stacks the features of two modalities into one feature vector, which is used
for classification. Sun [4] proposed a method to learn features from data of two modalities based on
canonical correlation analysis (CCA), but it has not been utilized in biometric recognition, and it is
not convenient for learning features from more than two modalities of data.

While current methods generally extract discriminative features from multimodal data,
they have rarely considered the data distribution. In this paper, we present an assumption that is
consistent with the viewpoint of discrimination, that is, in the same feature space, one person's
different biometric identifier data can form different Gaussians, and thus his overall biometric data can
be described using a Gaussian mixture model. Although LDA has been widely used in biometrics to
extract discriminative features, it is limited in that it can only handle the case where the data of one
person form a single Gaussian distribution. However, as pointed out above, in multimodal analysis, the
different biometric identifier data of one person can form a mixture of Gaussians. Fortunately, subclass
discriminant analysis (SDA) [30] has been proposed to remove this limitation of LDA, and it can therefore
be used to describe multimodal data that lie in the same input space.

Based on the analysis above, in this paper we propose a novel multimodal biometric feature
extraction scheme based on subclass discriminant analysis (SDA) [30]. For simplicity, we consider two
typical types of biometric data, that is, face data and palmprint data. For one person, his face data and
palmprint data are regarded as two subclasses of one class, and discriminative features are extracted by
seeking an embedded space, where the difference among subclasses belonging to different persons is
maximized, and the difference within each subclass is minimized. Then, since the parallel fusion
strategy is not suitable for fusing features from more than two modalities, we fuse the obtained features
by adopting the serial fusion strategy and use them for classification.

Two solutions are presented to solve the small sample size problem encountered in calculating the
optimal transform. One is to apply PCA preprocessing first, and the other is to employ the generalized
singular value decomposition (GSVD) [31,32] technique. Moreover, it is still worthwhile to explore the
nonlinear discriminant capability of SDA in multimodal feature fusion, in particular when some
single modalities still show complicated data distributions that are not linearly separable. Hence, in
this paper, we further extend the SDA feature fusion approach to the kernel space and present two
solutions to the small sample size problem, namely KPCA-SDA and KSDA-GSVD. In KPCA-SDA, we first use
KPCA to transform each single-modality input space $R^n$ into an $m$-dimensional space, where $m = \mathrm{rank}(K)$
and $K$ is the centralized Gram matrix. Then SDA is used to fuse the two transformed features and extract
discriminative features. In KSDA-GSVD, we directly perform Kernel SDA to fuse multimodal data,
applying GSVD to avoid the singularity problem.

We evaluate the proposed approaches on two face databases (AR and FRGC) and the PolyU
palmprint database, and compare the results with related methods that also aim to extract descriptive
features from multimodal data. Experimental results show that our approaches achieve higher






recognition rates and better verification performance than the compared
methods. It is worthwhile to point out that, although the proposed approaches are validated on data of
two modalities, they can be easily extended to biometric data with more than two modalities.

The rest of this paper is organized as follows: Section 2 describes the related work. Section 3 
presents our approach. In Section 4, we present the kernelization of our approach. Experiments and 
results are given in Section 5 and conclusions are drawn in Section 6. 

2. Related Work 

In this section, we first briefly introduce some typical multimodal biometrics fusion techniques such 
as pixel level fusion [1,2] and Yang's serial and parallel feature level fusion methods [3]. Further, three
related methods, which are SDA, KSDA and KPCA, are also briefly reviewed. 

2.1. Multimodal Fusion Scheme at the Pixel Level 

The general idea of pixel level fusion [1,2] is to fuse the input data from multiple modalities as early
as the pixel level, which may lead to less information loss. The pixel level fusion scheme fuses the
original input face data vector and palmprint data vector of one person, and then the discriminant
features are extracted from the fused dataset. For simplicity and fair comparison, we test the
effectiveness of this scheme by extracting LDA features from the fused set in this paper.

2.2. Serial Fusion Strategy and Parallel Fusion Strategy 

In [3], Yang et al. discussed two strategies to fuse features of two data modalities. One is
called the serial strategy and the other is called the parallel strategy. Let $x_i$, $y_i$ denote the face
feature vector and palmprint feature vector of the $i$th person, respectively. The serial fusion strategy
obtains the fused features by stacking the two vectors into one higher dimensional vector $\alpha_i$, i.e.:

$$\alpha_i = \begin{bmatrix} x_i \\ y_i \end{bmatrix} \tag{1}$$

On the other hand, the parallel fusion strategy combines the features into a complex vector $\beta_i$, i.e.:

$$\beta_i = x_i + \mathrm{i}\, y_i \tag{2}$$

where $\mathrm{i}$ is the imaginary unit. Yang et al. also pointed out that the fused feature sets
$\{\alpha_i\}$ and $\{\beta_i\}$ can either be used directly for
classification, which is called feature combination, or can be input into a feature extractor to further
extract more descriptive features with less redundant information, which is called feature fusion.
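
The two fusion strategies can be illustrated with a minimal NumPy sketch. The function names and the toy vectors are hypothetical, and zero-padding the shorter vector before the parallel combination is an assumption made here so that the two parts have the same length; it is not taken from the text above.

```python
import numpy as np

def serial_fusion(x, y):
    """Serial strategy: stack two real feature vectors into one longer vector."""
    return np.concatenate([np.asarray(x, dtype=float), np.asarray(y, dtype=float)])

def parallel_fusion(x, y):
    """Parallel strategy: one vector is the real part, the other the imaginary part.

    Assumption: the shorter vector is zero-padded to the common length.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = max(x.size, y.size)
    x = np.pad(x, (0, n - x.size))  # np.pad fills with zeros by default
    y = np.pad(y, (0, n - y.size))
    return x + 1j * y

# Toy feature vectors (hypothetical values).
face = np.array([0.2, 0.5, 0.1])
palm = np.array([0.7, 0.3])

alpha = serial_fusion(face, palm)    # real vector of length 3 + 2
beta = parallel_fusion(face, palm)   # complex vector of length max(3, 2)
```

The serial result grows with every added modality, whereas the complex representation has only a real and an imaginary part and therefore holds exactly two modalities.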

2.3. Subclass Discriminant Analysis (SDA) and Its Kernelization 

Subclass discriminant analysis (SDA) [30] is an extension of LDA, which aims at processing data
of one class that form a mixture of Gaussians. It divides each class into a number of subclasses,
and calculates a transformed space where the distances between both class means and subclass means
are maximized, and the distances between samples of each subclass are minimized. SDA redefines the
between-class scatter $\Sigma_B$ and within-class scatter $\Sigma_W$ as:

$$\Sigma_B = \sum_{i=1}^{C-1}\sum_{j=1}^{H_i}\sum_{k=i+1}^{C}\sum_{l=1}^{H_k} p_{ij}\,p_{kl}\,(\mu_{ij}-\mu_{kl})(\mu_{ij}-\mu_{kl})^T \tag{3}$$

$$\Sigma_W = \frac{1}{n}\sum_{i=1}^{C}\sum_{j=1}^{H_i}\sum_{k=1}^{n_{ij}} (x_{ijk}-\mu_{ij})(x_{ijk}-\mu_{ij})^T \tag{4}$$

where $C$ is the number of classes, $H_i$ is the number of subclasses of class $i$, $p_{ij} = n_{ij}/n$ is the
prior of the $j$th subclass of class $i$, $\mu_{ij}$ is the mean of the $j$th subclass of class $i$, and $x_{ijk}$
is the $k$th of the $n_{ij}$ samples in that subclass. The advantage of this new definition of between-class
scatter is that it emphasizes the role of class separability over that of intra-subclass scatter. The
optimal solution of SDA is given by the eigenvectors of matrix $\Sigma_W^{-1}\Sigma_B$ associated with the
largest eigenvalues.
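
As an illustration of the two scatter definitions, the following NumPy sketch computes the between-subclass and within-subclass scatter matrices from labelled samples. The function name, argument layout (samples stored in rows) and toy data are assumptions made for this example only.

```python
import numpy as np

def sda_scatter(X, class_ids, subclass_ids):
    """Return (S_B, S_W) for samples in the rows of X.

    S_B sums over pairs of subclasses that belong to different classes;
    S_W sums the deviations of every sample from its own subclass mean.
    """
    X = np.asarray(X, dtype=float)
    cls = np.asarray(class_ids)
    sub = np.asarray(subclass_ids)
    n, d = X.shape
    keys = sorted(set(zip(cls.tolist(), sub.tolist())))

    means, priors = {}, {}
    S_W = np.zeros((d, d))
    for (i, j) in keys:
        X_ij = X[(cls == i) & (sub == j)]
        mu = X_ij.mean(axis=0)
        means[(i, j)] = mu
        priors[(i, j)] = X_ij.shape[0] / n      # p_ij = n_ij / n
        D = X_ij - mu
        S_W += D.T @ D
    S_W /= n

    S_B = np.zeros((d, d))
    for (i, j) in keys:
        for (k, l) in keys:
            if k > i:                            # each pair of distinct classes once
                diff = means[(i, j)] - means[(k, l)]
                S_B += priors[(i, j)] * priors[(k, l)] * np.outer(diff, diff)
    return S_B, S_W

# Toy data: 2 classes, 2 subclasses per class, 2 samples per subclass.
X = np.array([[0, 0], [2, 0], [0, 4], [2, 4],
              [10, 0], [12, 0], [10, 4], [12, 4]], dtype=float)
class_ids = [0, 0, 0, 0, 1, 1, 1, 1]
subclass_ids = [0, 0, 1, 1, 0, 0, 1, 1]
S_B, S_W = sda_scatter(X, class_ids, subclass_ids)
```

Note that the two subclasses of the same class never appear together in the between-subclass sum, so the criterion does not try to push them apart.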

Kernel subclass discriminant analysis (KSDA) is the nonlinear extension of SDA based on kernel
functions [26]. The main idea of the kernel method is that, without knowing the nonlinear feature
mapping explicitly, we can work in the feature space through kernel functions. KSDA first maps the input
data $x$ into a feature space $F$ by using a nonlinear mapping $\phi$, and adopts a nonlinear clustering
technique to find the underlying distributions of datasets in the kernel space. The between-class scatter
matrix $S_B^{KSDA}$ and within-class scatter matrix $S_W^{KSDA}$ of KSDA are defined as:

$$S_B^{KSDA} = \sum_{i=1}^{C-1}\sum_{j=1}^{H_i}\sum_{k=i+1}^{C}\sum_{l=1}^{H_k} p_{ij}\,p_{kl}\,(\phi_{ij}-\phi_{kl})(\phi_{ij}-\phi_{kl})^T \tag{5}$$

$$S_W^{KSDA} = \frac{1}{n}\sum_{i=1}^{C}\sum_{j=1}^{H_i}\sum_{k=1}^{n_{ij}} (\phi(x_{ijk})-\phi_{ij})(\phi(x_{ijk})-\phi_{ij})^T \tag{6}$$

where $\phi_{ij}$ indicates the mean vector of the $j$th subclass of the $i$th class, and $\phi$ is the global
mean. Like SDA, KSDA tries to maximize the ratio

$$\frac{V^T S_B^{KSDA} V}{V^T S_W^{KSDA} V}$$

to find a transformation matrix $V$. The columns of $V$ are the eigenvectors corresponding to the largest
eigenvalues of $(S_W^{KSDA})^{-1} S_B^{KSDA}$.

2.4. Kernel Principal Component Analysis

In kernel PCA [33], the input data $x$ is mapped into a feature space $F$ via a nonlinear mapping $\phi$, and
a linear PCA is then performed in $F$. To be specific, we first centralize the mapped data so that
$\sum_{i=1}^{M}\phi(x_i) = 0$, where $M$ is the number of input data. Then the covariance matrix of the mapped
data $\phi(x_i)$ is defined as follows:

$$C = \frac{1}{M}\sum_{i=1}^{M}\phi(x_i)\,\phi(x_i)^T \tag{7}$$

As in PCA, the eigenvalue equation $\lambda V = CV$ must be solved for eigenvalues $\lambda \geq 0$ and eigenvectors
$V \in F\setminus\{0\}$. It can be shown that all the solutions $V$ lie in the space spanned by
$\phi(x_1),\ldots,\phi(x_M)$. Therefore, we may consider the equivalent system:

$$\lambda\,(\phi(x_k)\cdot V) = (\phi(x_k)\cdot CV) \quad \text{for all } k = 1,\ldots,M \tag{8}$$

and $V$ can be represented as a linear combination of the mapped data $\phi(x_i)$:

$$V = \sum_{i=1}^{M}\alpha_i\,\phi(x_i) \tag{9}$$

where $\alpha_1,\ldots,\alpha_M$ denote the coefficients. Substituting Equations (7) and (9) into Equation (8), and
defining an $M \times M$ matrix $K$ by:

$$K_{ij} = (\phi(x_i)\cdot\phi(x_j)) \tag{10}$$

we arrive at:

$$M\lambda K\alpha = K^2\alpha \tag{11}$$

where $\alpha$ denotes the column vector with entries $\alpha_1,\ldots,\alpha_M$, and $K$ is called the kernel
matrix. To find solutions of Equation (11), we can solve the equivalent eigenvalue problem:

$$M\lambda\alpha = K\alpha \tag{12}$$

for nonzero eigenvalues and obtain the optimal $\alpha$. Finally, we can project the mapped data $\phi(x_i)$
onto $V$ by using $\alpha$ to get the KPCA-transformed features [33].
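
The procedure can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the Gaussian kernel, the width `sigma`, the function names and the random toy data are assumptions chosen for the example. Centering is done on the Gram matrix, the eigenvectors of the centered Gram matrix play the role of $\alpha$, and each $\alpha$ is rescaled so that the corresponding $V$ has unit norm.

```python
import numpy as np

def gaussian_gram(X, sigma):
    """Gram matrix of a Gaussian kernel on the rows of X (kernel choice is an assumption)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))

def kernel_pca(X, sigma, n_components):
    """Return (features, alphas, centered Gram matrix) for the training samples."""
    M = X.shape[0]
    K = gaussian_gram(X, sigma)
    ones = np.full((M, M), 1.0 / M)
    # Centering the mapped data in F is equivalent to this operation on K.
    K_c = K - ones @ K - K @ ones + ones @ K @ ones
    # Eigenvalues of K_c equal M * lambda; eigh returns them in ascending order.
    eigvals, eigvecs = np.linalg.eigh(K_c)
    top = np.argsort(eigvals)[::-1][:n_components]
    # Rescale so that V = sum_i alpha_i phi(x_i) has unit norm: alpha^T K_c alpha = 1.
    alphas = eigvecs[:, top] / np.sqrt(eigvals[top])
    # Projection of training sample k onto V is sum_i alpha_i K_c[k, i].
    return K_c @ alphas, alphas, K_c

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))           # 12 toy samples of dimension 4
Z, alphas, K_c = kernel_pca(X, sigma=2.0, n_components=3)
```

The sketch only keeps components with clearly positive eigenvalues; requesting more components than the rank of the centered Gram matrix would divide by values near zero.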

3. Subclass Discriminant Analysis (SDA) Based Multimodal Biometric Feature Extraction 

In this section, we propose a novel multimodal biometric feature extraction scheme based on SDA.
Two solutions, PCA and GSVD, are separately introduced to avoid the singularity problem in SDA.
Then we present the algorithmic procedures of the proposed SDA-PCA and SDA-GSVD
approaches.

3.1. Problem Formulation

For simplicity, we take two typical types of biometric data as examples in this paper. One is the
face data, and the other is the palmprint data. From the viewpoint of discrimination, it is quite natural
to assume that the overall biometric data of one person may be regarded as one class. Moreover, his
palmprint and face data can be regarded as two subclasses of this class in the same feature space.
An example of two persons' face and palmprint samples is shown in Figure 1.

Figure 1. Illustration of the mixture-of-Gaussians distribution of face data and the corresponding
palmprint data. In this example, data of two persons are presented. Each contains 12 samples,
including six faces and six palmprints. We perform PCA on the original data for demonstration,
and the order of data magnitude is 1e4. Legend: face data of person 1; palm data of person 1;
face data of person 2; palm data of person 2.




As can be seen from Figure 1, the identifier samples of one person show a typical mixture-of-Gaussians
distribution, i.e., the face data cluster together and form a Gaussian, while the palmprint data form
another Gaussian. If we apply traditional LDA, which forces both the face and palmprint data of one
person to cluster together, then the data of two persons would very likely overlap in the embedded
space. It is apparent from Figure 1 that SDA is a better descriptor of such a data distribution.

Let $x_{i1}^k$ and $x_{i2}^k$ be the $k$th face sample and palmprint sample of person $i$, respectively, and
let $n_c$ represent the sample number of each subclass. Then we construct the between-subclass scatter
matrix $S_b$ and within-subclass scatter matrix $S_w$ as follows:

$$S_b = \sum_{i=1}^{c-1}\sum_{j=1}^{2}\sum_{k=i+1}^{c}\sum_{l=1}^{2} p_{ij}\,p_{kl}\,(\mu_{ij}-\mu_{kl})(\mu_{ij}-\mu_{kl})^T \tag{13a}$$

$$S_w = \frac{1}{N}\sum_{i=1}^{c}\sum_{j=1}^{2}\sum_{k=1}^{n_c} (x_{ij}^k-\mu_{ij})(x_{ij}^k-\mu_{ij})^T \tag{13b}$$

where $c$ is the number of persons, $N = c \times n_c$, $p_{ij} = p_{kl} = n_c/N$, and
$\mu_{ij} = \sum_{k=1}^{n_c} x_{ij}^k / n_c$.

Let $w$ be the optimal transform vector to be calculated; it can be obtained by:

$$\max_{w}\ \frac{w^T S_b w}{w^T S_w w} \tag{14}$$

The within-class scatter matrix $S_w$ is usually singular, so the solution cannot be calculated directly. We
present two solutions below to solve this problem, i.e., SDA-PCA and SDA-GSVD.

3.2. SDA-PCA 

The first solution is to first apply PCA to project each image $x_{ij}^k$ into a lower dimensional space,
and then apply SDA for feature extraction. By employing the Lagrange multipliers method to solve
the optimization problem (14), we can obtain the optimal solution $W_{SDA}$, i.e., the eigenvectors of
matrix $S_w^{-1} S_b$ associated with the largest eigenvalues.

Based on Formula (13b), the rank of $S_w$ is at most $n - 2c$, where $n$ represents the total number of training
samples (including face and palmprint images), and $c$ represents the number of persons. Therefore, we
can project the original samples into a subspace whose dimension is no more than $n - 2c$, and then apply
SDA to extract features.
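
The PCA-then-SDA idea can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the synthetic data, the reduced dimension `dim` and the use of subclass fractions as priors are all hypothetical choices for the example (a constant rescaling of the priors does not change the resulting directions). The eigenvectors of $S_w^{-1} S_b$ are obtained through a Cholesky factor of $S_w$, which is one of several equivalent ways to solve the problem when $S_w$ is nonsingular.

```python
import numpy as np

def pca_basis(X, dim):
    """Columns are the leading `dim` principal directions of the rows of X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:dim].T

def subclass_scatter(Z, person, modality):
    """Between- and within-subclass scatter; each (person, modality) pair is a subclass."""
    n, d = Z.shape
    groups = sorted(set(zip(person.tolist(), modality.tolist())))
    mask = {g: (person == g[0]) & (modality == g[1]) for g in groups}
    mu = {g: Z[mask[g]].mean(axis=0) for g in groups}
    prior = {g: mask[g].mean() for g in groups}   # assumption: subclass fraction
    S_w = np.zeros((d, d))
    for g in groups:
        D = Z[mask[g]] - mu[g]
        S_w += D.T @ D / n
    S_b = np.zeros((d, d))
    for a in groups:
        for b in groups:
            if b[0] > a[0]:                        # subclasses of different persons
                diff = mu[a] - mu[b]
                S_b += prior[a] * prior[b] * np.outer(diff, diff)
    return S_b, S_w

def discriminant_vectors(S_b, S_w, k):
    """Leading k eigenvectors of inv(S_w) @ S_b, for nonsingular S_w."""
    L = np.linalg.cholesky(S_w)
    T = np.linalg.solve(L, S_b)                    # L^{-1} S_b
    A = np.linalg.solve(L, T.T).T                  # L^{-1} S_b L^{-T}
    eigvals, V = np.linalg.eigh((A + A.T) / 2.0)
    top = np.argsort(eigvals)[::-1][:k]
    return np.linalg.solve(L.T, V[:, top])         # w = L^{-T} v

# Synthetic data: c persons, n_c samples per modality, dimension d.
rng = np.random.default_rng(0)
c, n_c, d, dim, k = 3, 4, 40, 8, 2
person = np.repeat(np.arange(c), n_c)
X_face = rng.normal(size=(c * n_c, d)) + 3.0 * person[:, None]
X_palm = rng.normal(size=(c * n_c, d)) - 3.0 * person[:, None] + 10.0

P1, P2 = pca_basis(X_face, dim), pca_basis(X_palm, dim)   # one PCA per modality
Z = np.vstack([X_face @ P1, X_palm @ P2])                 # reduced samples
S_b, S_w = subclass_scatter(Z, np.tile(person, 2), np.repeat([0, 1], c * n_c))
W_sda = discriminant_vectors(S_b, S_w, k)
W1, W2 = P1 @ W_sda, P2 @ W_sda                           # combined transforms
```

Here `dim = 8` is below $n - 2c = 24 - 6 = 18$, so the reduced within-subclass scatter is generically nonsingular and the Cholesky factorization succeeds.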

Let $W_{PCA}^1$ and $W_{PCA}^2$ denote the initial PCA transformations of the sample sets of the two
modalities, and $W_{SDA}$ denote the subsequent SDA transform. Then the final transformations for each
modality are expressed as:

$$W_1 = W_{PCA}^1 W_{SDA} \tag{15}$$

$$W_2 = W_{PCA}^2 W_{SDA} \tag{16}$$

After the optimal transformations $W_1$ and $W_2$ are obtained, we project the face sample $x_{i1}^k$ and
palmprint sample $x_{i2}^k$ on them:

$$y_{i1}^k = W_1^T x_{i1}^k, \qquad y_{i2}^k = W_2^T x_{i2}^k \tag{17}$$

Then, the features derived from face and palmprint are fused using the serial fusion strategy and used
for classification:

$$y_i^k = \begin{bmatrix} y_{i1}^k \\ y_{i2}^k \end{bmatrix} \tag{18}$$

3.3. SDA-GSVD

While PCA is a popular way to overcome the singularity problem and accelerate computation, it may
cause information loss. Therefore, we present a second way to overcome the singularity problem by
employing GSVD. First, we rewrite the between-class scatter matrix and within-class scatter matrix
as follows:

$$S_b = H_b H_b^T, \qquad S_w = H_w H_w^T \tag{19}$$

$H_b$ is obtained by transforming Formula (13a) as follows:

$$\begin{aligned}
S_b &= \sum_{i=1}^{c-1}\sum_{j=1}^{2}\sum_{k=i+1}^{c}\sum_{l=1}^{2} p_{ij}\,p_{kl}\,(\mu_{ij}-\mu_{kl})(\mu_{ij}-\mu_{kl})^T \\
    &= \left(\frac{n_c}{N}\right)^{2}\sum_{i=1}^{c-1}\sum_{j=1}^{2}
       \Big[2(c-i)\mu_{ij} - \sum_{k=i+1}^{c}\sum_{l=1}^{2}\mu_{kl}\Big]
       \Big[2(c-i)\mu_{ij} - \sum_{k=i+1}^{c}\sum_{l=1}^{2}\mu_{kl}\Big]^T
\end{aligned} \tag{20}$$

Comparing Equation (20) with Equation (19), $H_b$ is defined as:

$$H_b = [H_{(c-1)1},\ H_{(c-1)2},\ H_{(c-2)1},\ \ldots,\ H_{11},\ H_{12}] \tag{21}$$

where $H_{mn} = 2(c-m)\mu_{mn} - \sum_{k=m+1}^{c}\sum_{l=1}^{2}\mu_{kl}$.

According to Equation (13b), we can easily obtain $H_w$:

$$H_w = [x_{ij}^1-\mu_{ij},\ x_{ij}^2-\mu_{ij},\ \ldots,\ x_{ij}^{n_c}-\mu_{ij}]_{\,i=1,\ldots,c;\ j=1,2} \tag{22}$$

Then, we employ GSVD [31,32] to calculate the optimal transform, and the procedures are given in 
Algorithm 1. 

Algorithm 1. Procedures of GSVD based LDA.
Step 1: Define matrix $K = [H_b, H_w]^T$, and compute the complete orthogonal decomposition
$P^T K Q = \begin{bmatrix} R & 0 \\ 0 & 0 \end{bmatrix}$.
Step 2: Compute $G$ by performing SVD on matrix $P(1:c, 1:t)$, i.e., $U^T P(1:c, 1:t)\, G = \Sigma_A$, where $t$ is the
rank of $K$.
Step 3: Compute matrix $M = Q \begin{bmatrix} R^{-1}G & 0 \\ 0 & I \end{bmatrix}$. Put the first $c-1$ columns of $M$ into matrix $W$. Then,
$W$ is the optimal transform matrix.
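
A GSVD-based solution of this kind can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function name and the random factor matrices are hypothetical, the complete orthogonal decomposition is realized with an ordinary SVD (a special case in which $R$ is diagonal), the row block passed to the second SVD is taken to be all rows of $K$ that come from $H_b$, and only the columns associated with the range of $K$ are computed.

```python
import numpy as np

def lda_gsvd(H_b, H_w, n_components):
    """Directions that diagonalize H_b H_b^T and H_w H_w^T without inverting either."""
    k_b = H_b.shape[1]
    K = np.vstack([H_b.T, H_w.T])                   # stacked factors, one row per column of H
    # SVD as a complete orthogonal decomposition: K = P diag(s) Q^T.
    P, s, Qt = np.linalg.svd(K, full_matrices=False)
    tol = s[0] * max(K.shape) * np.finfo(float).eps
    t = int(np.sum(s > tol))                        # numerical rank of K
    # SVD of the block of P that belongs to H_b; rows of Gt are right singular vectors.
    _, _, Gt = np.linalg.svd(P[:k_b, :t])
    # M = Q_t R^{-1} G, with R = diag(s[:t]).
    M = Qt[:t].T @ (Gt.T / s[:t, None])
    return M[:, :n_components]

# Toy factors (hypothetical sizes): dimension 6, 4 between-columns, 10 within-columns.
rng = np.random.default_rng(1)
H_b = rng.normal(size=(6, 4))
H_w = rng.normal(size=(6, 10))
W = lda_gsvd(H_b, H_w, n_components=3)

A = W.T @ (H_b @ H_b.T) @ W     # projected between scatter
B = W.T @ (H_w @ H_w.T) @ W     # projected within scatter
```

In the transformed space both projected scatter matrices are diagonal and sum to the identity, with the leading columns carrying the largest share of between scatter. No inverse of the within scatter matrix is formed, which is why the approach remains usable when that matrix is singular.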

Then, face data $x_{i1}^k$ and palmprint data $x_{i2}^k$ are separately projected on $W$ and fused using the
serial fusion strategy:

$$y_i^k = \begin{bmatrix} W^T x_{i1}^k \\ W^T x_{i2}^k \end{bmatrix} \tag{23}$$

$y_i^k$ is then used for classification.

3.4. Algorithmic Procedures

In this section, we summarize the complete algorithmic procedures of the proposed approach. In
practice, if the dimensions of the two biometric data $x_{i1}^k$ and $x_{i2}^k$ are not equal, we can simply pad the
lower-dimensional vector with zeros until its dimension is equal to that of the other one before fusing
them using SDA. In the case of SDA-PCA, after PCA projection, it is easy to guarantee that $x_{i1}^k$ and
$x_{i2}^k$ have the same dimension if we select the same number of principal components for them.

Figure 2. The complete procedures of SDA based multimodal feature extraction.

Figure 2 displays the complete procedure of the proposed approach for multimodal biometric
recognition. It is worthwhile to note that, on one hand, our approach outputs the features of each modality
separately, which is convenient for later processing; on the other hand, the discriminative information of
different modalities has been initially fused in the extraction process, since their features are extracted
from the same input space and the transformed space also considers the distribution of the data of the
other modalities. Therefore, we think this approach can effectively obtain fused discriminative
information from multimodal data.

4. SDA Kernelization Based Multimodal Biometric Feature Extraction

In this section, we provide the nonlinear extensions of the two SDA based multimodal feature
extraction approaches, which are named KPCA-SDA and KSDA-GSVD. In KPCA-SDA, we first
apply Kernel PCA to each single modality before performing SDA, while in KSDA-GSVD, we directly
perform Kernel SDA to fuse multimodal data, applying GSVD to avoid the singularity problem.

4.1. KPCA-SDA 

In this subsection, the SDA-PCA approach is performed in a high dimensional space by using the
kernel trick. We realize KPCA-SDA in the following steps:

(1) Nonlinear mapping.

Let $\phi: R^d \to F$ denote a nonlinear mapping. The original samples $x_{i1}^k$ and $x_{i2}^k$ of the two modalities
(face and palmprint) are mapped into $F$ by $\phi$: $x_{i1}^k \to \phi(x_{i1}^k)$, $x_{i2}^k \to \phi(x_{i2}^k)$. We obtain two
sets of mapped samples $\Psi_1 = \{\phi(x_{11}^1), \phi(x_{11}^2), \ldots, \phi(x_{c1}^{n_c})\}$ and
$\Psi_2 = \{\phi(x_{12}^1), \phi(x_{12}^2), \ldots, \phi(x_{c2}^{n_c})\}$.

(2) Perform KPCA for each single-modal database.

For the $j$th modality, we perform KPCA by maximizing the following equation:

$$J(w_{j,KPCA}^{\phi}) = (w_{j,KPCA}^{\phi})^T S_t^{\phi_j}\, w_{j,KPCA}^{\phi} \tag{24}$$

where $S_t^{\phi_j} = \sum_{i=1}^{c}\sum_{k=1}^{n_c}(\phi(x_{ij}^k) - m_j^{\phi})(\phi(x_{ij}^k) - m_j^{\phi})^T$, and $m_j^{\phi}$ is
the global mean of the $j$th modal database in the kernel space.

According to the kernel reproducing theory [34], the projection transformation $w_{j,KPCA}^{\phi}$ in $F$ can be
linearly expressed by using all the mapped samples:

$$w_{j,KPCA}^{\phi} = \sum_{i=1}^{c}\sum_{k=1}^{n_c}\alpha_{ij}^k\,\phi(x_{ij}^k) = \Psi_j\alpha_j \tag{25}$$

where $\alpha_j = (\alpha_{1j}^1, \alpha_{1j}^2, \ldots, \alpha_{cj}^{n_c})^T$ is a coefficient vector.

Substituting Equation (25) into Equation (24), we have:

$$J(\alpha_j) = \alpha_j^T \Psi_j^T S_t^{\phi_j} \Psi_j \alpha_j = \alpha_j^T K^j (K^j)^T \alpha_j \tag{26}$$

where $K^j = \Psi_j^T\tilde{\Psi}_j$, with $\tilde{\Psi}_j$ denoting the mapped sample set $\Psi_j$ centered by $m_j^{\phi}$, indicates
an $N \times N$ non-symmetric kernel matrix whose element is
$K^j_{m,n} = \langle\phi(x_{jm}),\ \phi(x_{jn}) - m_j^{\phi}\rangle$, where $N$ denotes the total number of the samples and $x_{jm}$
denotes the $m$th sample of the $j$th modal database.

The solution of Equation (26) is equivalent to the eigenvalue problem:

$$\lambda_j\alpha_j = K^j (K^j)^T \alpha_j \tag{27}$$

The optimal solutions $\alpha_j = (\alpha_{j1}, \alpha_{j2}, \ldots, \alpha_{j(N-c)})$ are the eigenvectors corresponding to the $N - c$
largest eigenvalues of $K^j (K^j)^T$. We project the mapped training sample set on $w_{j,KPCA}^{\phi}$ by:

$$Z_{KPCA}^{j} = (w_{j,KPCA}^{\phi})^T\tilde{\Psi}_j = \alpha_j^T\Psi_j^T\tilde{\Psi}_j = \alpha_j^T K^j \tag{28}$$

(3) Calculate kernel discriminant vectors in the KPCA transformed space.

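By using the KPCA transformed sample set $Z_{KPCA}^{j}$, we reformulate Equations (13a) and (13b) as:

$$S_b^{\phi} = \sum_{i=1}^{c-1}\sum_{j=1}^{2}\sum_{k=i+1}^{c}\sum_{l=1}^{2} p_{ij}\,p_{kl}\,(\mu_{ij}^{\phi}-\mu_{kl}^{\phi})(\mu_{ij}^{\phi}-\mu_{kl}^{\phi})^T \tag{29}$$

$$S_w^{\phi} = \frac{1}{N}\sum_{i=1}^{c}\sum_{j=1}^{2}\sum_{k=1}^{n_c}(z_{ij}^k-\mu_{ij}^{\phi})(z_{ij}^k-\mu_{ij}^{\phi})^T \tag{30}$$

where $z_{ij}^k$ is a sample in $Z_{KPCA}^{j}$, and $\mu_{ij}^{\phi} = \sum_{k=1}^{n_c} z_{ij}^k / n_c$.

We can obtain a set of nonlinear discriminant vectors $W_{SDA}^{\phi}$, i.e., the eigenvectors of matrix
$(S_w^{\phi})^{-1}S_b^{\phi}$ associated with the largest eigenvalues.

(4) Construct the nonlinear projection transformation and perform classification.

We then construct the nonlinear projection transformation $W_j^{\phi}$ as:

$$W_j^{\phi} = w_{j,KPCA}^{\phi}\, W_{SDA}^{\phi} \tag{31}$$

After the optimal transform is obtained, the fused features can be generated as:

$$y^{\phi} = \begin{bmatrix} (W_1^{\phi})^T\phi(x_{i1}^k) \\ (W_2^{\phi})^T\phi(x_{i2}^k) \end{bmatrix} \tag{32}$$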

4.2. KSDA-GSVD 



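In this subsection, SDA-GSVD is performed in a high dimensional space by using the kernel trick.
We are given two sets of mapped samples $\Psi_1 = \{\phi(x_{11}^1), \phi(x_{11}^2), \ldots, \phi(x_{c1}^{n_c})\}$ and
$\Psi_2 = \{\phi(x_{12}^1), \phi(x_{12}^2), \ldots, \phi(x_{c2}^{n_c})\}$ that correspond to the face and palmprint
modalities, respectively. $H_b$ and $H_w$ are then recalculated in the kernel space:

$$H_b^{\phi} = [H_{(c-1)1}^{\phi},\ H_{(c-1)2}^{\phi},\ H_{(c-2)1}^{\phi},\ \ldots,\ H_{11}^{\phi},\ H_{12}^{\phi}] \tag{33}$$

$$H_w^{\phi} = [\phi(x_{ij}^1)-\mu_{ij}^{\phi},\ \phi(x_{ij}^2)-\mu_{ij}^{\phi},\ \ldots,\ \phi(x_{ij}^{n_c})-\mu_{ij}^{\phi}]_{\,i=1,\ldots,c;\ j=1,2} \tag{34}$$

where

$$H_{mn}^{\phi} = 2(c-m)\mu_{mn}^{\phi} - \sum_{k=m+1}^{c}\sum_{l=1}^{2}\mu_{kl}^{\phi}, \qquad
\mu_{ij}^{\phi} = \frac{1}{n_c}\sum_{k=1}^{n_c}\phi(x_{ij}^k) \tag{35}$$

Then, we apply GSVD to calculate the optimal transformation so that the singularity problem is
avoided. The procedure follows Algorithm 1. When the optimal $W^{\phi}$ is obtained,
the fused features can be generated as:

$$y^{\phi} = \begin{bmatrix} (W^{\phi})^T\phi(x_{i1}^k) \\ (W^{\phi})^T\phi(x_{i2}^k) \end{bmatrix} \tag{36}$$

Finally, the nearest neighbor classifier with cosine distance is employed to perform classification.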
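
The final classification step can be illustrated with a short NumPy sketch of a nearest neighbor rule under cosine distance. The function name, the `eps` guard against zero-length vectors and the toy fused features are assumptions made for the example.

```python
import numpy as np

def cosine_nearest_neighbor(train_feats, train_labels, test_feats, eps=1e-12):
    """Assign each test vector the label of the training vector with the
    smallest cosine distance (1 - cosine similarity)."""
    train = np.asarray(train_feats, dtype=float)
    test = np.asarray(test_feats, dtype=float)
    # Normalize rows to unit length; eps avoids division by zero.
    train = train / np.maximum(np.linalg.norm(train, axis=1, keepdims=True), eps)
    test = test / np.maximum(np.linalg.norm(test, axis=1, keepdims=True), eps)
    cosine_distance = 1.0 - test @ train.T
    return np.asarray(train_labels)[np.argmin(cosine_distance, axis=1)]

# Toy fused feature vectors for two enrolled persons (labels 7 and 9).
train = np.array([[1.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 1.0]])
labels = np.array([7, 9])
queries = np.array([[2.0, 0.1, 1.5, 0.0],
                    [0.1, 5.0, 0.0, 3.0]])
predicted = cosine_nearest_neighbor(train, labels, queries)
```

Because cosine distance ignores vector length, the rule compares only the direction of the fused feature vectors.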
5. Experiments

In this section, we compare the proposed multimodal feature extraction approaches with single
modal methods and several representative multimodal biometric fusion methods. The identification and
verification performance of our approaches and the other compared methods is evaluated on two face
databases and one palmprint database.

5.1. Introduction of Databases 

Two public face databases (AR and FRGC) and one public palmprint database (PolyU palmprint
database) are employed to test our proposed approaches. The AR face database [35] contains over
4,000 color face images of 126 people (70 men and 56 women), including frontal views of faces with
different facial expressions, under different lighting conditions and with various occlusions. Most of
the pictures were taken in two sessions (separated by two weeks). Each session yielded 13 color
images, with 119 individuals (65 men and 54 women) participating in each session. We selected
images from 119 individuals for use in our experiment, for a total of 3,094 (= 119 × 26)
samples. All color images are transformed into gray images and each image is scaled to 60 × 60 with
256 gray levels. Figure 3 illustrates all of the samples of one subject.



The FRGC database [36] contains 12,776 training images for the FRGC Experiment 4, consisting of both
controlled and uncontrolled images of 222 individuals, with 36-64 images per individual. The
controlled images have good image quality, while the uncontrolled images display poor image quality,
such as large illumination variations, low resolution of the face region, and possible blurring. It is these
uncontrolled factors that pose the grand challenge to face recognition performance. We use the training
images of the FRGC Experiment 4 as our database. We choose 36 images of each individual and then
crop every image to the size of 60 × 60. All images of one subject are shown in Figure 4.

The palmprint database [37,38], which is provided by the Hong Kong Polytechnic University (HK
PolyU), contains palmprint images collected from 189 individuals. Around 20 palmprint images from each
individual were collected in two sessions, with around 10 samples captured in each
session. Therefore, the database contains a total of 3,780 images from
189 palms. In order to reduce the computational cost, each subimage was compressed to 60 × 60. We
took these subimages as palmprint image samples for our experiments. All cropped images of one
subject are shown in Figure 5.

Figure 3. Demo images of one subject from the AR face database.

Figure 4. Demo images of one subject from the FRGC face database.




In order to test the proposed fusion techniques, in the experiment in which we fuse the AR database
and the PolyU palmprint database, we choose 119 subjects from both the face and palmprint databases, and
each class contains 20 samples. Similarly, in the experiment in which we fuse the FRGC database and the
PolyU palmprint database, we choose 189 subjects from both the face and palmprint databases, and each
class contains 20 samples. We assume that the samples of one subject in the palmprint database correspond
to the samples of one subject in the face database. For the AR face database and PolyU palmprint
database, we randomly select eight samples from each person (four face samples from the AR database
and four palmprint samples from the PolyU database) for training, and use the rest for testing. For the
FRGC face database and PolyU palmprint database, we randomly select six samples from each person
(three face samples from the FRGC database and three palmprint samples from the PolyU database) for
training, and use the rest for testing. We run all compared methods 20 times. In our experiments, we
consider the Gaussian kernel $k(x, y) = \exp(-\|x - y\|^2 / 2\delta_i^2)$ for the compared kernel methods, and set the
parameter $\delta_i = i \times \delta$, $i \in \{1, \ldots, 20\}$, where $\delta$ is the standard deviation of the training data set. For each
compared kernel method, the parameter $i$ was selected such that the best classification performance
was obtained.
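
The kernel and its candidate widths can be written down in a few lines of NumPy. In this sketch the function names and the random toy data are hypothetical, and taking the standard deviation over all entries of the training matrix is an assumption about how the single value for the training set is computed.

```python
import numpy as np

def gaussian_kernel(X, Y, delta_i):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (np.sum(X ** 2, axis=1)[:, None]
                + np.sum(Y ** 2, axis=1)[None, :]
                - 2.0 * X @ Y.T)
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * delta_i ** 2))

def candidate_widths(X_train, n_steps=20):
    """Widths i * delta for i = 1..n_steps, with delta the std of the training data.

    Assumption: delta is the standard deviation over all entries of X_train.
    """
    delta = float(np.std(X_train))
    return [i * delta for i in range(1, n_steps + 1)]

rng = np.random.default_rng(0)
X_train = rng.normal(size=(10, 5))     # toy training set
widths = candidate_widths(X_train)     # 20 candidate kernel widths
K = gaussian_kernel(X_train, X_train, widths[2])   # Gram matrix for i = 3
```

Selecting the index $i$ then amounts to repeating the chosen kernel method once per candidate width and keeping the width with the best classification performance.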
5.2. Experimental Identification Results

First, the identification experiments are conducted. Identification is a one-to-many comparison
which aims to answer the question "who is this person?" We compare the identification
performance of the two proposed approaches, i.e., SDA-PCA (abbreviated to SDA here) and
SDA-GSVD, with single modal recognition using traditional LDA, a representative pixel level
fusion method [1], parallel and serial feature level fusion [3], and a score level fusion method using the
sum rule [7]. Further, we compare the proposed kernelization methods (KPCA-SDA and
KSDA-GSVD) with single modal recognition using KDA. Figures 6 and 7 show the
recognition rates over 20 random tests of our approaches and the other compared methods: (a) SDA,
SDA-GSVD, LDA (single modal), pixel level fusion, parallel feature fusion, serial feature fusion and
score level fusion; (b) KPCA-SDA, KSDA-GSVD and KDA (single modal). The average recognition
rates are given in Tables 1 and 2, which correspond to the figures above.



Figure 6. Recognition rates of compared methods on AR face and PolyU palmprint
databases: (a) Linear methods; (b) Nonlinear methods. Horizontal axis: random testing number.

Figure 7. Recognition rates of compared methods on FRGC face and PolyU palmprint
databases: (a) Linear methods; (b) Nonlinear methods.

Table 1. Average recognition rates of compared methods on AR face and PolyU palmprint databases.

(a) Linear methods

AR and palmprint                                                 Average recognition rates (%)
Single modal recognition    AR LDA                               75.09 ± 7.39
                            Palmprint LDA                        82.26 ± 3.50
Multimodal recognition      Pixel level fusion [1]               95.35 ± 4.50
                            Parallel feature fusion [3]          92.48 ± 2.61
                            Serial feature fusion [3]            90.71 ± 3.06
                            Score level fusion [7]               92.99 ± 2.63
                            SDA based feature extraction         96.52 ± 1.16
                            SDA-GSVD based feature extraction    98.23 ± 0.68

(b) Nonlinear methods

AR and palmprint                                                 Average recognition rates (%)
Single modal recognition    AR KDA                               79.50 ± 6.83
                            Palmprint KDA                        83.45 ± 4.47
Multimodal recognition      KPCA-SDA                             98.74 ± 0.45
                            KSDA-GSVD                            99.15 ± 0.63

Table 2. Average recognition rates of compared methods on FRGC face and PolyU 
palmprint databases. 

FRGC and palmprint                                              Average recognition rates (%) 
Single modal recognition    FRGC LDA                            78.26 ± 4.53 
                            Palmprint LDA                       80.22 ± 3.26 
Multimodal recognition      Pixel level fusion [1]              97.21 ± 2.89 
                            Parallel feature fusion [3]         94.92 ± 2.17 
                            Serial feature fusion [3]           94.54 ± 1.57 
                            Score level fusion [7]              95.59 ± 4.70 
                            SDA based feature extraction        98.06 ± 1.09 
                            SDA-GSVD based feature extraction   98.61 ± 0.99 

(a) Linear methods 

FRGC and palmprint                                              Average recognition rates (%) 
Single modal recognition    FRGC KDA                            80.44 ± 2.57 
                            Palmprint KDA                       81.23 ± 3.26 
Multimodal recognition      KPCA-SDA                            98.82 ± 0.32 
                            KSDA-GSVD                           99.02 ± 0.31 

(b) Nonlinear methods 



Table 1 shows that on the AR and PolyU palmprint databases, SDA and SDA-GSVD perform better 
than the other compared linear methods, and that KPCA-SDA and KSDA-GSVD achieve better 
recognition results than KDA (single modal). Compared with single modal LDA, pixel level fusion, 
parallel feature fusion, serial feature fusion and score level fusion, SDA improves the average 
recognition rate by at least 3.53% (=96.52%-92.99%), and SDA-GSVD improves it by at least 
5.24% (=98.23%-92.99%). The average recognition rate of KPCA-SDA is at least 15.29% 
(=98.74%-83.45%) higher than that of KDA (single modal), and that of KSDA-GSVD is at least 
15.70% (=99.15%-83.45%) higher. Table 2 shows a similar phenomenon on the FRGC and PolyU 
palmprint databases. SDA boosts the average recognition rate by at least 0.85% 
(=98.06%-97.21%), and SDA-GSVD boosts it by at least 1.40% (=98.61%-97.21%), over the other 
linear methods. The average recognition rate of KPCA-SDA is at least 17.59% (=98.82%-81.23%) 
higher than that of KDA (single modal), and that of KSDA-GSVD is at least 17.79% 
(=99.02%-81.23%) higher. 
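
The "at least" margins quoted above are the gain of a proposed method over the best competing method. The following minimal sketch reproduces this arithmetic for the linear methods of Table 2; the rates are copied from that table, while the helper name `min_margin` is only an illustrative choice.

```python
# Average recognition rates (%) of the compared linear methods, from Table 2(a).
baselines = {
    "FRGC LDA": 78.26,
    "Palmprint LDA": 80.22,
    "Pixel level fusion": 97.21,
    "Parallel feature fusion": 94.92,
    "Serial feature fusion": 94.54,
    "Score level fusion": 95.59,
}
# Average recognition rates (%) of the proposed linear methods, from Table 2(a).
proposed = {"SDA": 98.06, "SDA-GSVD": 98.61}


def min_margin(rate, others):
    """Smallest gain of `rate` over any baseline, i.e., the gain over the best one."""
    return rate - max(others.values())


for name, rate in proposed.items():
    print(f"{name}: +{min_margin(rate, baselines):.2f}")
```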

5.3. Experimental Results of Verification 

Verification is a one-to-one comparison which aims to answer the question of "is this person 
who he/she claims to be?" In the verification experiments, we report the performance using 
receiver operating characteristic (ROC) curves, which plot the false rejection rate (FRR) as a 
function of the false accept rate (FAR). There is a tradeoff between the FRR and the FAR: 
reducing one of them risks increasing the other, and the ROC curve reflects this tradeoff. 
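
The tradeoff between the FAR and the FRR can be illustrated with a minimal sketch. It assumes that a claim is accepted when its matching score is greater than or equal to a decision threshold (higher score meaning a better match); the scores below are hypothetical and are not taken from our experiments.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) as fractions when scores >= threshold are accepted."""
    # FAR: fraction of impostor attempts that are wrongly accepted.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # FRR: fraction of genuine attempts that are wrongly rejected.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


# Hypothetical matching scores, NOT experimental data.
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]

# Sweeping the threshold traces out the ROC curve point by point.
for t in (0.25, 0.55, 0.85):
    far, frr = far_frr(genuine, impostor, t)
    print(f"threshold={t:.2f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold lowers the FAR and raises the FRR, which is the tradeoff that the ROC curve summarizes.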

Figures 8 and 9 show the ROC curves of our approaches and the other compared methods on the 
different databases, and Table 3 lists the equal error rate (EER, the point on the ROC curve 
where the FAR is equal to the FRR) of all compared methods. From the ROC curves and the results 
listed in Table 3, we can see that our SDA based feature extraction approaches attain 
significantly lower EERs than the other representative multimodal fusion methods, including 
the pixel level, score level and feature level fusion methods. On the AR face and PolyU 
palmprint databases, the lowest EER of the related methods is 3.71%, while the EERs of our 
approaches are all below 1%, and our KSDA-GSVD approach obtains the lowest EER (0.56%) among 
all compared methods. On the FRGC face and PolyU palmprint databases, the lowest EER of the 
other methods is 2.62%, while the EERs of our approaches are all below 2%; in particular, the 
proposed SDA-GSVD approach obtains the lowest EER, 0.28%. These experimental results 
demonstrate the superiority of our approaches. 
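
As an illustration of how an EER can be read off an ROC curve, the following minimal sketch linearly interpolates between the two sampled points that bracket the crossing of the FAR and the FRR. The interpolation scheme and the sample points are assumptions made for this example; the paper does not specify how the EER values in Table 3 were estimated.

```python
def equal_error_rate(points):
    """Estimate the EER from (FAR, FRR) pairs ordered by increasing threshold.

    Along the list the FAR falls and the FRR rises; the EER is the value at
    which they cross, found by linear interpolation between bracketing points.
    """
    for (far0, frr0), (far1, frr1) in zip(points, points[1:]):
        d0, d1 = far0 - frr0, far1 - frr1
        if d0 == 0:
            return far0
        if d0 > 0 >= d1:
            t = d0 / (d0 - d1)  # fraction of the segment where FAR == FRR
            return far0 + t * (far1 - far0)
    raise ValueError("FAR and FRR do not cross on the given points")


# Hypothetical ROC samples as fractions, NOT experimental data.
points = [(0.10, 0.00), (0.04, 0.02), (0.01, 0.05), (0.00, 0.12)]

eer = equal_error_rate(points)
print(f"EER = {100 * eer:.2f}%")
```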

Figure 8. ROC curves of all compared methods on AR face and PolyU palmprint 
databases: (a) Linear methods; (b) Nonlinear methods. 

Figure 9. ROC curves of all compared methods on FRGC face and PolyU palmprint 
databases: (a) Linear methods; (b) Nonlinear methods. 

Table 3. The equal error rate (EER) of all compared methods on different databases. 

                            Method                              AR and Palmprint EER (%)    FRGC and Palmprint EER (%) 
Single modal recognition    Face LDA                            15.45                       8.13 
                            Palmprint LDA                       4.32                        3.14 
                            Face KDA                            6.13                        5.72 
                            Palmprint KDA                       8.36                        10.85 
Multimodal recognition      Pixel level fusion [1]              3.95                        3.25 
                            Parallel feature fusion [3]         3.71                        3.27 
                            Serial feature fusion [3]           7.84                        4.41 
                            Score level fusion [7]              5.12                        2.62 
                            SDA based feature extraction        0.83                        1.05 
                            SDA-GSVD based feature extraction   0.72                        0.28 
                            KSDA based feature extraction       0.87                        1.90 
                            KSDA-GSVD based feature extraction  0.56                        0.84 

6. Conclusions 

In this paper, we present novel multimodal biometric feature extraction approaches using subclass 
discriminant analysis (SDA). To overcome the singularity problem encountered in the calculation, 
we present two solutions. The first applies principal component analysis (PCA) before SDA, and 
the second employs the generalized singular value decomposition (GSVD) to obtain the solution 
directly. Further, we present kernel extensions (KPCA-SDA and KSDA-GSVD) for multimodal 
biometric feature extraction. We perform experiments on two public face databases (the AR face 
database and the FRGC database) and the PolyU palmprint database, first on the AR and palmprint 
databases and then on the FRGC and palmprint databases. Compared with several representative 
linear and nonlinear multimodal biometrics recognition methods, the proposed approaches achieve 
better identification and verification performance. In particular, the proposed KSDA-GSVD 
approach achieves the highest average recognition rates on both database pairs. 